(54) Texturing systems for use in three-dimensional imaging systems



(57) A method and apparatus are provided for generating textured data for use in texturing an image.
Texture data is first represented by arbitrary compressed codes in which selected compressed code
values define principal colors and other compressed code values define colors which can be formed by
selected weighted averages of principal colors, the corresponding code values also being weighted
averages of the code values of the selected principal colors. An output texel is interpolated from a
plurality of input texels, with the interpolating step being effected using compressed code values.
These code values are subsequently decompressed to give the actual color values.
Description

Field of the Invention 

[0001] This invention relates to texturing systems for use in three-dimensional imaging systems, such as are used
in computer displays. Such texturing systems are, for example, employed in the display of computer games.

Prior Art 

[0002] The following prior documents and materials should be referred to for further information on the background
to this technology. Certain of them are referred to where convenient by number in the subsequent description:

[1] Catmull, E., "A Subdivision Algorithm for Computer Display of Curved Surfaces", Ph.D. Thesis, report
UTEC-Csc-74-133, Computer Sciences Department, University of Utah, Salt Lake City, UT, December 1974.
[2] Blinn, J.F. and M.E. Newell, "Texture and Reflection in Computer Generated Images", CACM, 19(10), October
1976, 542-547.
[3] Wolberg, G., "Digital Image Warping", IEEE Computer Society Press, Los Alamitos, CA, 1990.
[4] Williams, L., "Pyramidal Parametrics", SIGGRAPH 83, pages 1-11.
[5] VideoLogic, "Apocalypse 3Dx data", available from VideoLogic Limited, Kings Langley, England.
[6] Microsoft Corporation, "Texture and Rendering Engine Compression (TREC)", Microsoft Technical Brief, internet
address: www.microsoft.com/hwdev/devdes/whntrec.htm
[7] Beers, A. C., Agrawala, M. and Chaddha, N., "Rendering from Compressed Textures", 1996 Computer Graphics
Annual Conference, page 373.
[8] Hakura, Z.S. and Gupta, A., "The design and analysis of a cache architecture for texture mapping", Computer
Architecture News, Vol.25, no.2, pp.108-20, May 1997.
[9] United States Patent US-A-5,612,747 "Method and apparatus for vector quantization caching in a real time
video coder".
[10] United Kingdom Patent Application GB-A-2,297,886 "Texturing and shading of 3-D images", Applicants:
VideoLogic Limited.
[11] Gray, R. M., "Vector Quantization", IEEE Transactions on Communications, January 1980, pp.4-20.
[12] Koegel Buford, J., "Multimedia Systems", Addison-Wesley publishing company, 1994.
[13] Foley, F. and van Dam, A., "Computer Graphics Principles and Practice", Addison-Wesley publishing company,
1990.
[14] Microsoft, "DirectX 6.0 SDK Documentation", Microsoft Corporation, 1 Microsoft Way, Redmond, USA. Internet
address: www.microsoft.com/directx.

Background of the Invention 

[0003] Computer based images are commonly formed of an array of picture elements or pixels. Each surface to be
displayed may be represented by the pixels within a polygon, commonly a triangle. The surface is given color, texture
and/or shading by an operation known as "texture mapping". Textures are stored as arrays of pixels, conveniently
termed texels (texture pixels). Thus "texture mapping" involves the mapping of a 2D (two-dimensional) array of texels
onto another 2D array of pixels representing a surface in a 3D (three-dimensional) scene. This technique was developed
by Catmull [ref. 1] and refined by Blinn and Newell [ref. 2]. Perspective texture mapping involves rotating and translating
the texture map so that its perspective appears correct on the final image. Texture mapping improves the realism of
the scene by giving the surface of an object a realistic finish. An example of this is mapping a marble texture to the
surface of a statue, giving the statue the appearance that it is made of marble. For a large scene many different texture
bitmaps are required to represent all the different textures which might be present in the scene.
[0004] As just noted, a 3D scene is usually represented by a number of polygons or triangles. In order to fill a polygon
with a texture, each pixel on its surface is used to calculate the co-ordinate of a texel in the texture map. The nearest
texel to the one calculated in the texture map may be used to shade the finally displayed pixel. This is called point
sampling. Alternatively, bilinear filtering or bilinear interpolation may be used to improve the quality of the textured
image. In bilinear filtering the point in the texture map from which the 2D pixel is to be mapped onto the 3D surface is
calculated to sub-pixel accuracy. Bilinear filtering or interpolation is then used to blend the four closest pixels to this
calculated position in the texture map in order to attain a more accurate representation of the pixel color. This is
illustrated in the accompanying Figure 1, where the texels A, B, C, and D are blended to provide a texel value for a pixel
at point X on the two-dimensional image plane. This operation of bilinear, i.e. two-axis, filtering (or interpolation) is
further described in ref. 3.
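
The blending of four texels described above can be sketched as two horizontal linear blends followed by one vertical blend. This is a minimal illustration only: the function names, the assignment of A, B, C and D to corners, and the use of floating-point values are assumptions, since Figure 1 is not reproduced here.

```python
def lerp(a, b, t):
    """Linear blend: returns a when t == 0 and b when t == 1."""
    return a + (b - a) * t


def bilinear(a, b, c, d, fx, fy):
    """Blend four neighbouring texel values for a sample point X.

    a, b are taken as the top-left and top-right texels and c, d as the
    bottom-left and bottom-right ones (an assumed layout). fx and fy are
    the fractional offsets of X inside the 2x2 cell, each in 0..1.
    """
    top = lerp(a, b, fx)
    bottom = lerp(c, d, fx)
    return lerp(top, bottom, fy)


# A sample exactly in the middle of the cell is the plain average of A..D.
centre = bilinear(10.0, 20.0, 30.0, 40.0, 0.5, 0.5)
```

With fx = fy = 0 the result collapses to texel A alone, which corresponds to point sampling of that texel.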
[0005] Trilinear (three-axis) filtering is the same process over the four closest pixels on two different mip-map levels
[ref. 4]. This is illustrated in Figure 2 of the present application. Mip-maps are copies of the original texture map which
have been pre-processed by being filtered so as to be successively reduced to half the resolution. MIP here stands
for MULTUM IN PARVO (much in a small place). This is repeated until the resulting image is 1 pixel in size (this assumes
that the texture is square and of a size which is a power of 2), so that there is a hierarchical series of the mip-maps.
Figure 3 shows an example of a brick texture at 128 x 128 resolution with the associated lower mip-map levels. A
mip-map can be thought of as a pyramid.
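
As a small illustration of the pyramid, the sketch below lists the edge lengths of a square, power-of-two texture and of each successive mip-map level. The function name is hypothetical and only the halving rule stated above is modelled, not the filtering itself.

```python
def mip_chain_sizes(size):
    """Return the edge lengths of a square texture and all of its mip-map levels."""
    # The halving scheme assumes a square texture whose size is a power of two.
    if size < 1 or size & (size - 1):
        raise ValueError("size must be a positive power of two")
    sizes = [size]
    while size > 1:
        size //= 2          # each level has half the resolution of the one above
        sizes.append(size)
    return sizes


# The 128 x 128 brick texture of Figure 3 yields eight levels in total.
levels = mip_chain_sizes(128)
```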

[0006] Texture filtering has the effect of reducing the occurrence of aliasing when sampling textures. For more
information on aliasing see ref. 3.

[0007] Three-dimensional image generation is computationally intensive. Animated 3D images for games and
Computer Aided Design (CAD) applications are becoming increasingly expensive in terms of processing power, as scenes
become more photo-real and images are required to respond in real-time. A large number of floating point calculations
are required to determine the geometry of the polygon structure in the scene and a large number of arithmetic operations
are required to fill and shade the polygons. Dedicated hardware is available [ref. 5] that can perform these operations
many times more efficiently than software. Accesses to stored databases are also a limiting factor to performance.
Local memory in dedicated hardware can reduce the effect of any memory access bottlenecks. Texture mapping is
particularly memory intensive, especially when performing a filtering (that is, interpolation) operation where many texture
pixels are read for every pixel that is mapped onto the display.

[0008] The size of 2D texture map data is therefore reduced by texture compression so that it can be located in
a smaller memory space. A small memory requirement leads to lower system costs. The original texture map can then
be retrieved from the compressed data by decompression. As 3D scenes become more realistic, texture maps become
larger and more numerous, making the use of texture compression more important. Several schemes have already
been developed including Texture and Rendering Engine Compression (TREC) from Microsoft [ref. 6]. Beers [ref. 7]
first discussed the technique of rendering images from compressed textures.
[0009] It is convenient at this point to consider, and define, the various types of memory that are available to the
system designer. The term "local memory" refers to solid state semiconductor memory located close to the memory
control semiconductor device or circuit. The term "internal memory" refers to memory located within the particular
semiconductor device being referred to. "External memory" is any memory outside the semiconductor device. Local
memory can be DRAM based. DRAM is an acronym for Dynamic Random Access Memory, which is a solid-state
semiconductor memory. Synchronous DRAM (SDRAM) enables data accesses to be co-ordinated by a clock signal. SDRAM
has a higher access bandwidth capability than DRAM due to its pipelined architecture but is more expensive. Local
memory and internal memory can be DRAM or SDRAM based. External memory can be solid-state or a mass storage
array such as a hard disk. Semiconductor memory is very expensive and makes up a large percentage of the overall
cost of a computer system.

[0010] DRAM is addressed over a multiplexed address bus, that is, the address needed to access an individual data
item is transmitted to the memory device in two parts. The core memory array in the DRAM device is a rectangular
matrix where a single data item is addressed when a row control line and a column control line are activated at the
same time. This requires a separate row and column address. If the row address does not change between sequential
accesses, then only the column address needs to be transmitted. A row of data in the DRAM array is known as a page.
When the row address remains unchanged between accesses, the accesses are said to be "in page". "In page"
accesses are much quicker than those that span two or more pages, and memory system designers endeavour to keep
bursts of accesses in page. Some memory devices, such as SDRAM, make use of multiple memory banks to improve
performance. Each memory bank can have its own page open, permitting data accesses to separate areas of memory
without breaking page.
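
The notion of an "in page" access can be illustrated by splitting each address into a row part and a column part and counting how often the row changes. This is a simplified single-bank model; the function name and the figure of 256 columns per row are assumptions chosen for illustration, not values taken from any particular device.

```python
def count_page_breaks(addresses, columns_per_row=256):
    """Count accesses that force a different row (page) to be opened.

    Single-bank model: only one page can be open at a time. The value of
    columns_per_row is illustrative, not a property of a real device.
    """
    breaks = 0
    open_row = None
    for address in addresses:
        row, _column = divmod(address, columns_per_row)
        if open_row is not None and row != open_row:
            breaks += 1     # the row address changed, so the access is not "in page"
        open_row = row
    return breaks


in_page = count_page_breaks([0, 1, 2, 3])           # every access stays in row 0
alternating = count_page_breaks([0, 256, 1, 257])   # rows 0, 1, 0, 1
```

An access stream that alternates between two rows, as reads from two different mip-map levels may do, breaks page on almost every access in this model.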

[0011] One technique used to improve memory performance is "memory caching", in which the result of all external
memory accesses is stored in a separate internal memory. This internal memory can be accessed much faster than
external memory. If the result of a particular memory access already resides in the cache, then the cache is read instead
of the external memory. This has the effect of reducing traffic to the external memory, and therefore reducing the
"bandwidth" requirements of that memory. The bandwidth requirement of a memory system is often directly related to
the cost of that system. In a fixed bandwidth system an increased bandwidth requirement can lead to a reduction of
overall system performance.

[0012] Texturing is the most performance-intensive process of 3D imaging as it requires all textures to be retrieved
from memory. Techniques such as trilinear and bilinear filtering (interpolation) require up to eight texture pixels or texels
to be retrieved from memory for every pixel projected onto the display, as described above and illustrated in Figures
1 and 2. Texturing therefore requires a very high bandwidth path into memory. Texture caching can be employed to
reduce the texturing bandwidth requirement and increase system performance. The optimum performance objective
is to be able to read all necessary texels in one processing pipeline clock cycle. Some work has already been done
on studying the effects of using a cache to improve the performance of texture accesses [ref. 8]. Hakura demonstrates
how caches can be highly effective for texture mapping graphics, and concludes that the effective memory to bandwidth
ratio can be reduced by a factor of three to fifteen using certain caching strategies.

[0013] As previously indicated, texture mapping is used to improve 3D image quality by mapping high detail onto a
surface of a 3D object. This should be done without complicating the object's geometry. However, texture mapping
produces a wide variety of visual artefacts, including aliasing [ref. 13]. Bilinear filtering [ref. 3] is used to improve the
quality of the resulting image but there remain many artefacts that bilinear filtering cannot solve, including depth aliasing.
Depth aliasing is the result of the texture getting more compressed as an object moves further from the viewpoint. This
form of aliasing can be resolved by use of mip-maps [ref. 4], but there is still a problem called mip-banding. Mip-banding
occurs during the transition period between mip-maps when the texture changes from one level of detail to another.
This may appear for example on a road, seen in the foreground, which disappears into the distance. Successive
mip-maps are used along the road and the transition from one mip-map to the next can be visible. This problem can be
solved with the application of trilinear filtering [ref. 4], which interpolates the level of detail between mip-maps as
described above.

[0014] The best form of trilinear filtering is that which is performed on a per-pixel basis. This requires eight texture
pixels (texels) to produce the final on-screen pixel. As these texels can be located anywhere in memory, eight separate
memory reads are often required. Trilinear filtering is performed between two mip-levels, and so four memory reads
occur from one mip-map location and four from another. Textures are usually stored in local memory, although system
memory texturing is becoming more popular. These memories have a finite bandwidth and are very often required to
serve as a memory resource for many different applications. Set-up parameters, depth information, and display
information are usually stored in local memory, and system applications are usually run from system memory. Eight
individual memory reads per pixel is usually beyond the capabilities of many memory systems. Added to page changes
between mip-maps, this often results in less than adequate 3D performance.
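
The eight-texel operation can be sketched as one bilinear blend in each of the two mip-map levels followed by a blend between the two results. The function names, the corner ordering of each group of four texels and the scalar texel values are assumptions made for illustration.

```python
def lerp(a, b, t):
    """Linear blend: returns a when t == 0 and b when t == 1."""
    return a + (b - a) * t


def bilinear(texels, frac):
    """Blend four texels (top-left, top-right, bottom-left, bottom-right)."""
    a, b, c, d = texels
    fx, fy = frac
    return lerp(lerp(a, b, fx), lerp(c, d, fx), fy)


def trilinear(upper_texels, upper_frac, lower_texels, lower_frac, level_fraction):
    """Blend eight texels: four from each of two adjacent mip-map levels.

    level_fraction is 0.0 at the upper (higher-resolution) level and 1.0
    at the lower one; intermediate values interpolate the level of detail.
    """
    upper = bilinear(upper_texels, upper_frac)
    lower = bilinear(lower_texels, lower_frac)
    return lerp(upper, lower, level_fraction)


# Quarter of the way from a uniformly dark level to a uniformly bright one.
pixel = trilinear((0.0, 0.0, 0.0, 0.0), (0.5, 0.5),
                  (8.0, 8.0, 8.0, 8.0), (0.25, 0.75), 0.25)
```

Each call consumes two groups of four texels, which is why up to eight separate memory reads can be needed when the texels are scattered in memory.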

[0015] The memory bandwidth required for a trilinear texture access system is dependent on the number of memory
accesses needed for each texture filtering operation and the pixel throughput performance demanded by the
application. Equation 1 shows how the texture bandwidth can be determined. The equation also shows how the bandwidth of
page breaks must also be taken into account.

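$$\mathrm{Bandwidth}_{texture} = \left((\mathrm{Accesses}_{pixel} \times \mathrm{Width}_{memory}) + (\mathrm{Accesses}_{page\_break} \times \mathrm{Width}_{memory})\right) \times \mathrm{Throughput}_{pixel}$$
Equation (1)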



Where:

Bandwidth_texture is the texture bandwidth demanded from the memory, measured in bytes/s. This is not the memory
bandwidth that can be supplied by the memory.

Accesses_pixel is the average number of memory accesses per pixel. Not all the required texels can be read in one
access, even with the right data width.

Accesses_page_break is the average number of memory access slots lost to page breaks per pixel. A single page
break using SDRAM requires at least 8 access slots.

Width_memory is the width of the memory data bus, measured in bytes. This has to be at least 8 bytes (64 bits) to
ensure that four texels can be read in one clock cycle.

Throughput_pixel is the pixel throughput demanded by the application, measured in pixels/s. For most modern
applications this is around 100 Mpixels/s.
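
Equation 1 can be evaluated directly, as in the sketch below. The function and parameter names are hypothetical. The input figures (four 64-bit accesses and two page breaks per pixel at 100 Mpixels/s) are those quoted later in the description for a conventional uncompressed system; treating each page break as eight lost access slots is an interpretation, although it does reproduce the quoted 16 Gbytes/s.

```python
def texture_bandwidth(accesses_per_pixel, page_break_slots_per_pixel,
                      bus_width_bytes, pixels_per_second):
    """Texture bandwidth demanded from memory, in bytes/s (Equation 1)."""
    data_bytes = accesses_per_pixel * bus_width_bytes
    lost_bytes = page_break_slots_per_pixel * bus_width_bytes
    return (data_bytes + lost_bytes) * pixels_per_second


SLOTS_PER_PAGE_BREAK = 8     # "at least 8 access slots" per SDRAM page break

uncompressed = texture_bandwidth(
    accesses_per_pixel=4,                               # four 64-bit accesses per pixel
    page_break_slots_per_pixel=2 * SLOTS_PER_PAGE_BREAK,  # two page breaks per pixel
    bus_width_bytes=8,                                  # 64-bit data bus
    pixels_per_second=100_000_000,                      # 100 Mpixels/s
)
```

In this example the page-break term accounts for 128 of the 160 bytes charged to each pixel, which is why removing page breaks has such a large effect on the total.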

[0016] The average accesses per pixel is the number of separate memory accesses required to retrieve all data
necessary for the filtering operation. Using a 64-bit memory bus, maximum throughput is achieved if four 15-bit texels
are required and they reside in the same data word. Unfortunately this is not always the case and very often the texture
data resides in two or four separate words. Equation 2 shows how Accesses_pixel can be found, taking into account the
varying number of accesses for a single texture operation.



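$$\mathrm{Accesses}_{pixel} = \frac{\left((\mathrm{Percentage}_{single} \times 1) + (\mathrm{Percentage}_{double} \times 2) + (\mathrm{Percentage}_{quadruple} \times 4)\right) \times \mathrm{Mipmaps}}{\mathrm{Factor}_{compression}}$$
Equation (2)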



Where:
Percentage_single is the percentage of single mip-map accesses that can be retrieved in one memory access.
Percentage_double is the percentage of single mip-map accesses that can be retrieved in two memory accesses.
Percentage_quadruple is the percentage of single mip-map accesses that can be retrieved in four memory accesses.
Mipmaps is the number of mip-maps involved in the filter operation.
Factor_compression is the compression factor if the texture is compressed.

[0017] The average number of accesses is calculated by taking into account the likelihood that all data will be retrieved
in single, double or quadruple accesses. The equation also takes account of the number of mip-maps required in the
filtering operation: trilinear filtering uses two but bilinear requires only one. The equation also shows how the average
number of accesses is reduced linearly with compression. Obviously the less data there is to fetch, the fewer memory
accesses are required and the more likely all the data will reside in the same data word.
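
The weighting described above can be sketched as follows. The function name is hypothetical, the percentages are expressed as fractions that sum to one, and the example mix of 50% single, 25% double and 25% quadruple accesses is purely illustrative. Dividing by the compression factor is an assumption drawn from the statement that the number of accesses falls linearly with compression.

```python
def accesses_per_pixel(single, double, quadruple, mipmaps, compression_factor=1.0):
    """Average number of memory accesses per pixel.

    single, double and quadruple are the fractions of per-mip-map fetches
    that need one, two or four memory accesses; they must sum to 1.
    """
    if abs(single + double + quadruple - 1.0) > 1e-9:
        raise ValueError("access fractions must sum to 1")
    per_mipmap = single * 1 + double * 2 + quadruple * 4
    return per_mipmap * mipmaps / compression_factor


# Illustrative mix only: half the fetches fit one word, a quarter need two, a quarter four.
trilinear_uncompressed = accesses_per_pixel(0.5, 0.25, 0.25, mipmaps=2)
trilinear_compressed = accesses_per_pixel(0.5, 0.25, 0.25, mipmaps=2, compression_factor=8.0)
bilinear_compressed = accesses_per_pixel(0.5, 0.25, 0.25, mipmaps=1, compression_factor=8.0)
```

Halving the number of mip-maps read from memory halves the result, which is the effect exploited when the lower level is not fetched separately.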
[0018] Equation 1 thus shows that, even with the use of texture caches, the physical memory bandwidth requirement
still remains beyond the scope of any viable memory system. For this reason texture compression is employed, not
only to reduce the physical size of the stored texture, with its associated reduction in memory cost, but also to reduce
the volume of texture data that is transferred from that memory.

[0019] Equation 2 shows how the compression factor affects the number of memory accesses. High performance,
dedicated hardware can be used to decompress the textures in real-time after they have been read from memory.
Many texture compression techniques have been tried, including: Vector Quantisation (VQ) [ref. 11]; Color lookup
tables (CLUT) or Palletisation [ref. 12]; Discrete Cosine Transformation (DCT) [ref. 12]; and several proprietary
techniques [refs. 6 and 14]. But each has its associated problems: VQ and Palletised require two memory reads or large
internal data caches and quality can be limited; DCT requires a large amount of decompression logic, and with a limited
silicon budget this can be unfeasible; and many proprietary techniques provide limited compression ratios and quality.
As Equation 1 demonstrates, these techniques only go part way towards resolving the bandwidth requirements of
trilinear filtering.

[0020] Memory access streams that continuously swap between different memory banks can have a large effect on
performance. Equation 1 shows how dominant page breaks are to the performance of texture filtering as a whole. For
trilinear filtering, page breaks can be particularly problematic, where a number of mip-maps can often span more than
one memory page.

[0021] 3D imaging techniques often demand such a high level of performance that only dedicated hardware solutions
can be used. This often requires the development of a special silicon chip. As well as performing texture mapping, the
chip will often be called upon to perform all the geometry processing, polygon set up, image projection, illumination
calculations, fogging calculations, hidden surface removal, and display control. Therefore it is critical that each stage
in the generation of a 3D image is made as small as possible to enable all processes to fit on the same silicon die. As
well as requiring a large memory bandwidth, a trilinear filtering operation can only be implemented in a large amount
of logic and therefore silicon area. It is important that an optimum solution be found that limits the required logic, and
therefore cost, to a minimum.

[0022] WO-A-9741534 discloses a method and a system for texturing for use in three dimensional imaging. This has
a memory for storing mip-map data, an input for receiving input data indicating the type of mip-map data, a control for
retrieving the mip-map data from memory in accordance with the input, a cache for storing portions of mip-map data
retrieved from memory and a trilinear interpolator for interpolating an output texel from mip-map data stored in the cache.
[0023] It is seen from the foregoing that the requirements of memory, speed and ease of construction of the chip are
very substantial and are taxed to the full in 3D imaging, particularly when texture mapping. Even using all available
techniques for meeting the requirements, the constraints are still very difficult to meet if high quality real-time imaging
is to be achieved.


Summary of the Invention 

[0024] The invention in its various aspects is defined in the independent claims below, to which reference should 
now be made. Advantageous features are set forth in the appendant claims. 

[0025] Preferred embodiments of the invention are described below with reference to the drawings. In these
embodiments the efficiency of trilinear texture filtering is improved by the generation of the lower-level texture mip-map on
the fly from the upper-level mip-map, and with the use of texture caching and texture decompression as a means of
meeting the high memory bandwidth requirements. Removing the need to read the lower-level mip-map in a separate
memory access removes page break problems and enhances performance.

[0026] More particularly, the preferred texturing systems comprise a memory for storing mip-map data for use in
texturing an image, the mip-map data comprising a hierarchical series of mip-maps of different levels of decreasing
resolution. Data is received at an input indicating the type of mip-map data required and the level of the mip-map or
mip-maps from which the data is to be taken. A controller retrieves from the memory the mip-map data required in
accordance with the input data, and a cache stores portions of the mip-map data retrieved from the memory. A
lower-level mip-map generator generates portions of the mip-map next below the mip-map of which portions are held
in the cache. A trilinear interpolator interpolates an output texel from input texels from the two mip-map levels.

[0027] [...] data compression schemes lend themselves well to the generation of lower mip-maps on the fly [...]
by closely supporting the filtering algorithm used to generate [...] performance and quality of a 3D rendered
image while minimising the hardware overhead.

[0028] In one preferred embodiment, the texture data is represented by arbitrary compressed codes, in which
selected compressed code values define principal colors and other compressed code values define intermediate colors
which can be formed by selected weighted averages of principal colors, the corresponding code values also being
weighted averages of the code values of the selected principal colors. The lower-level mip-map generator interpolates
an output texel from a plurality of input texels by operating on the compressed code values.
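
The property being relied upon can be illustrated with a toy code table. Everything in the sketch below is hypothetical: the two principal colors, the choice of codes 0 and 4 for them, and the linear spacing of the intermediate codes are assumptions made for the example, not the coding scheme of the embodiment.

```python
PRINCIPAL_A = (0, 0, 0)         # colour assigned to code 0 (hypothetical)
PRINCIPAL_B = (200, 100, 40)    # colour assigned to code 4 (hypothetical)
MAX_CODE = 4


def decompress(code):
    """Map a code to a colour; codes 1..3 are weighted averages of the principals."""
    return tuple(a + (b - a) * code // MAX_CODE
                 for a, b in zip(PRINCIPAL_A, PRINCIPAL_B))


def average_codes(codes):
    """Filter in the compressed domain: average the code values themselves."""
    return sum(codes) // len(codes)


def average_colours(colours):
    """Filter in the colour domain: average each channel separately."""
    n = len(colours)
    return tuple(sum(channel) // n for channel in zip(*colours))


codes = [0, 4, 2, 2]
via_codes = decompress(average_codes(codes))                      # filter, then decompress
via_colours = average_colours([decompress(c) for c in codes])     # decompress, then filter
```

Here both routes give the same colour because the codes average exactly; with integer rounding the two results may differ slightly in general. Filtering first means only one decompression per output texel instead of one per input texel.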

Brief Description of the Drawings 

[0029] The invention will now be described in more detail, by way of example, with reference to the drawings, in which:

Figure 1 is a diagram illustrating bilinear filtering of the texture map in an imaging system;
Figure 2 is a similar diagram illustrating trilinear filtering of the texture map in an imaging system;
Figure 3 illustrates a series of mip-maps;
Figure 4 is a block schematic diagram of part of an imaging system in accordance with the present invention;
Figure 5 is a diagram illustrating the operation of the embodiment of Figure 4;
Figure 6 illustrates the storage of texture words in a worst-case scenario;
Figure 7 is a block schematic diagram similar to Figure 4 of a second embodiment of the invention using four
caches and four decompression units in parallel;
Figure 8 illustrates in similar manner to Figure 6 the storage of texture words in a best-case scenario;
Figure 9 [...] decompression units;
Figure 10 is a block diagram of another modification of the circuit of Figure 7; and
Figures 11 and 12 illustrate how interpolation can be directly applied to compressed texture codes.

Detailed Description of the Preferred Embodiments

[0030] Figure 4 shows an embodiment of a texturing system in accordance with the invention. The texturing
system [...] the imaging system as a whole, and [...] controller 24. The memory 22 holds inter alia a compressed texture
[...] the part of the memory 22 where the desired texture is located. The
texture will correspond to a particular texture type representing the type of surface to be displayed. For example this
may be part of a brick wall or the surface of a road. This surface texture will be held in mip-map form as described
[...] of the surface, account being taken of the distance of the surface from the observer.

[0031] [...] decompressed texels. The decompressed texels are then applied both to a trilinear
interpolator 34 [...] applied to the trilinear interpolator 34. Thus the interpolator receives
the texels from the mip-map in front of and that behind the surface point under consideration, so that a trilinear
interpolation, as illustrated in Figure 2, can be executed. The trilinear interpolator provides a trilinear filtered pixel [...]

[0032] In operation, therefore, the texture address generator 28 feeds the addresses of the eight trilinear texels to
[...] unit (DU) 32 then decompresses 16 upper mip-map texels from the compressed data. Four are selected as the
[...] map texture. This is because four upper-level texels will be used to generate each lower-level texel, as described
below. These upper-level and lower-level mip-maps are then used by the trilinear interpolator 34 to generate the single
trilinear pixel. Although as many as 16 texels are required for this technique to work, which, uncompressed, would
represent a large memory bandwidth, when compressed this represents a considerable memory bandwidth
improvement with minimum page breaks.

[0033] The lower-level mip-map is generated by four parallel filters or interpolators 40 that are able to take the 16
upper-level texels and generate the four lower-level texels in one clock cycle. Figure 5 illustrates this process. Out of the 16
original texels, four are decoded to produce the four upper-level texels which are directly used for trilinear filtering.
These 16 are also divided into four quadrants, each containing four pixels, which are then filtered by four digital
low-pass filters to produce the four lower-level mip-map texels required for trilinear filtering. This filtering algorithm is shown
in Equation 3:



Lower-level color value0 = (upper0 + upper1 + upper4 + upper5) / 4
Lower-level color value1 = (upper2 + upper3 + upper6 + upper7) / 4
Lower-level color value2 = (upper8 + upper9 + upper12 + upper13) / 4
Lower-level color value3 = (upper10 + upper11 + upper14 + upper15) / 4          Equation (3)
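
Equation 3 can be written out as a short routine. The sketch assumes that the 16 upper-level texels are numbered row by row across a 4 x 4 block, which is the ordering implied by the index groups in the equation; the function name and the use of scalar color values are illustrative.

```python
# Indices of the four texels in each 2 x 2 quadrant of a row-major 4 x 4 block,
# taken directly from Equation 3.
QUADRANTS = [
    (0, 1, 4, 5),        # top-left     -> lower-level value 0
    (2, 3, 6, 7),        # top-right    -> lower-level value 1
    (8, 9, 12, 13),      # bottom-left  -> lower-level value 2
    (10, 11, 14, 15),    # bottom-right -> lower-level value 3
]


def lower_level_texels(upper):
    """Average each quadrant of 16 upper-level texels into one lower-level texel."""
    if len(upper) != 16:
        raise ValueError("expected 16 upper-level texels")
    return [sum(upper[i] for i in quadrant) / 4 for quadrant in QUADRANTS]


# Using the texel indices themselves as colour values makes the grouping visible.
lower = lower_level_texels(list(range(16)))
```

The four averages are independent of one another, which is what allows four filters to produce them in parallel.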


[0034] By applying actual values to Equation 1 and Equation 2, the cost of trilinear filtering can be determined. It has
been determined that in a conventional system, four accesses of 64-bit data are required for each trilinear pixel when
uncompressed, with an average of two page breaks per pixel. This gives a total bandwidth requirement of 16 Gbytes/
second for a system that must produce 100 Mpixels per second. Such performance is unacceptable, but even when
the texture data is cached and compressed the bandwidth demanded is not much better (13.2 Gbytes/s). However,
when the double page break is removed by generating the lower mip-map "on the fly" in the generator 36 of Figure 4,
in accordance with this invention, the bandwidth is reduced to an acceptable level. Cached and compressed texture
bandwidth requirements for this system are of the order of only 200 Mbytes/s. A 64-bit memory system running at 100
MHz has a peak bandwidth capability of 800 Mbytes/s, so the required 200 Mbytes/s is well within its capacity. Indeed,
the lower-level mip-map construction system embodying the invention can leave spare bandwidth which will be
available for other uses.

[0035] In summary, therefore, it will be seen that the texturing system of Figure 4, which is designed for use as part
of a three-dimensional imaging system, includes the memory 22 for storing mip-map data for use in texturing an image,
the mip-map data comprising a hierarchical series of mip-maps of different levels of decreasing resolution. The input
26 receives input data indicating the type of mip-map data required, and the level of the mip-map or mip-maps from
which the data is to be taken. The controller 24 is coupled to the input and to the memory for retrieving from memory
the mip-map data required in accordance with the input data. The cache 30 is coupled to the controller for storing
portions of mip-map data retrieved from memory which relate to a selected mip-map level. The trilinear interpolator 34
is coupled to the cache to receive mip-map data from one level of mip-map, namely the upper level of the two levels
from which data is needed, and to interpolate an output texel from input texels from two mip-map levels. In accordance
with the invention, the texturing system also includes the lower-level mip-map generator 36 which is coupled between
the cache and the trilinear interpolator, and generates in real-time portions of the mip-map next below (in the hierarchical
series) the mip-map of which portions are held in the cache.

[0036] The embodiment of the invention illustrated in Figure 4 shows how one cache 30 and one decompression
unit (DU) 32 are required to return all data necessary to construct a complete trilinear interpolated pixel. A problem
exists, however, when the filter algorithm needs data from two or four additional data words. Figure 6 shows the worst
case situation. Consider the texture bitmap divided into tiles, with each tile representing the texture pixels provided by
one compressed data word. In this example the data word contains 16 texture pixels. The darker outlined boxes in
Figure 6 represent the tiles. In this example, the filter algorithm requires texture pixels from four separate compressed
data words. For a single cache/decompression system, this worst case situation requires four cache reads and therefore
a minimum of four clock cycles. This problem also exists, to a lesser extent, when two separate data words are required.
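
The one-, two- and four-word cases can be illustrated by counting how many tiles a filter footprint overlaps. The sketch assumes that the 16 texels of a data word form a 4 x 4 tile and that the footprint is a 2 x 2 group of texels; both are illustrative assumptions, as is the function name.

```python
def tiles_touched(u, v, footprint=2, tile=4):
    """Return the set of (tile_x, tile_y) tiles overlapped by a square footprint.

    (u, v) is the top-left texel of the footprint. Each tile stands for one
    compressed data word, so the size of the set is the number of words
    (and, with a single cache, the number of cache reads) needed.
    """
    return {(x // tile, y // tile)
            for x in range(u, u + footprint)
            for y in range(v, v + footprint)}


best = len(tiles_touched(0, 0))     # footprint entirely inside one tile
edge = len(tiles_touched(3, 0))     # footprint straddles one tile boundary
worst = len(tiles_touched(3, 3))    # footprint sits on a corner shared by four tiles
```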
[0037] A four-cache and four-DU construction can provide all requisite data in one clock period, no matter where the
compressed data resides. Figure 7 shows a block diagram of a second embodiment of the invention which comprises
such a 'quad cache' arrangement. The texturing system of Figure 7 is similar to the system of Figure 4 and only the
differences will be described in detail. In the second embodiment of Figure 7, the memory system 22 and input 26 are
the same as in Figure 4. In this case the texture address generator 28 has four outputs to cope with the situation where
the outputs of a maximum of four tiles are required. The four outputs provide respectively texture addresses 1 to 4,
and these are applied to texture caches 1 to 4, referenced 60. These communicate to and from the memory controller
[...] an arbiter 58, so as to receive the texels from the four tiles respectively. These are held in the four caches.
[...] compressed textures, namely cached compressed texture 1 to 4 respectively.
[0038] The outputs of the four caches 60 are respectively applied to four decompression units 62, each similar to
[...]. Each decompression unit 62 applies its decompressed data to upper and lower
[...] a low-level mip-map filter 66. The outputs of the upper and lower decode units 68, 70
are respectively applied to a trilinear interpolator 64.

[0039] In operation, each cache 60 is responsible for a different segment of the memory, based on the size of a
cache word and the lower two bits of the texture address [...]

[0040] With this arrangement, four caches and four DUs are sufficient to make a complete trilinear pixel.
Without the filtering to generate the lower-level mip-map from the upper one, eight caches and DUs would be
required: four for the upper map and four for the lower map. Thus, the generation of lower-level mip-maps as required
on the fly saves logic (silicon area) as well as improving performance. When coupled with compression, the amount
of data required is also greatly reduced.
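
The saving can be sketched in a few lines: every texel that the trilinear blend needs from the lower level is rebuilt from upper-level texels, so only the upper map is fetched. This is an illustrative sketch under stated assumptions, not the circuit of Figure 7. It assumes that a lower-level texel is the plain mean of the corresponding 2x2 upper-level block, works on a single colour channel held as floats, and ignores texel-centre offsets and edge clamping. All function names are hypothetical.

```python
def lower_texel(upper, x, y):
    """Lower-level texel (x, y), assumed to be the mean of the
    corresponding 2x2 block of upper-level texels."""
    return (upper[2 * y][2 * x] + upper[2 * y][2 * x + 1] +
            upper[2 * y + 1][2 * x] + upper[2 * y + 1][2 * x + 1]) / 4.0


def sample_bilinear(fetch, u, v):
    """Bilinear blend of the 2x2 texels around (u, v); fetch(x, y)
    returns one texel."""
    x, y = int(u), int(v)
    fu, fv = u - x, v - y
    top = fetch(x, y) * (1 - fu) + fetch(x + 1, y) * fu
    bottom = fetch(x, y + 1) * (1 - fu) + fetch(x + 1, y + 1) * fu
    return top * (1 - fv) + bottom * fv


def trilinear(upper, u, v, lod_frac):
    """Trilinear sample that reads only the upper-level map; the
    lower-level texels are generated as they are needed."""
    hi = sample_bilinear(lambda x, y: upper[y][x], u, v)
    lo = sample_bilinear(lambda x, y: lower_texel(upper, x, y), u / 2, v / 2)
    return hi * (1 - lod_frac) + lo * lod_frac


if __name__ == "__main__":
    ramp = [[float(x) for x in range(8)] for _ in range(8)]
    print(trilinear(ramp, 2.5, 2.5, 0.5))
```

Note that `trilinear` never indexes a stored lower-level map, which is the property the paragraph above relies on when it counts four caches rather than eight.
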
[0041] [...] the lower-level mip-maps are still stored in the memory. They will be required for
[...] interpolation [...] generator [...] regenerate those texels which are needed for the pixels
being processed using the [...]

[0042] One of the major advantages of generating lower-level mip-maps from compressed upper-level mip-maps is
the amount of logic optimisation that can take place. Less logic leads to smaller hardware and lower cost. One such
area of optimisation is in the caches and decompression units.

[0043] In the quad cache system the caches are effectively memory mapped, i.e. each cache holds data for a
particular segment of the memory 22. The DUs do not necessarily have to be memory mapped;
if they were, each DU would be associated with a particular cache. This would also necessitate
each DU being able to produce at least four upper-level texels as well as four lower-level texels in one cycle
for the 'one data word' situation. The best case situation is illustrated in Figure 8 where all the tiles come from one
data word.

[0044] If the DUs are dynamically reallocated based on the resources required, then several hardware
optimisations can be made to dramatically reduce the size of the hardware required. We have determined that not
more than one unit need provide four parallel upper-level texels, not more than two units need provide two parallel
upper-level texels, and none need provide more than one lower-level texel.
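
The upper-level part of this observation can be checked by enumeration. The sketch below assumes 4x4-texel tiles and a 2x2 upper-level footprint, counts how many texels each touched tile must supply, and tests whether the largest demands fit decompression units able to deliver four, two, one and one texels in parallel. The capacity list, the tile size and all function names are illustrative assumptions, and lower-level texels are not modelled.

```python
from collections import Counter

TILE = 4                    # assumption: 4x4 texels per compressed data word
DU_CAPACITY = [4, 2, 1, 1]  # parallel upper-level texels per DU (illustrative)


def texels_per_tile(u, v, tile=TILE):
    """Texels required from each touched tile, largest demand first,
    for the 2x2 footprint whose top-left texel is (u, v)."""
    counts = Counter(((u + du) // tile, (v + dv) // tile)
                     for du in (0, 1) for dv in (0, 1))
    return sorted(counts.values(), reverse=True)


def fits(demand, capacity=DU_CAPACITY):
    """True if the i-th largest demand can be served by the i-th DU."""
    return all(d <= c for d, c in zip(demand, capacity))


def all_demands(tile=TILE):
    """Every distinct demand pattern over all footprint positions."""
    return {tuple(texels_per_tile(u, v, tile))
            for u in range(tile) for v in range(tile)}


if __name__ == "__main__":
    for demand in sorted(all_demands(), reverse=True):
        print(demand, fits(demand))
```

Only three demand patterns arise under these assumptions: four texels from one tile, two texels from each of two tiles, and one texel from each of four tiles. Each of them fits the reduced capacities.
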

[0045] Figure 9 shows how these optimisations fit into a modified quad cache system. The modified arrangement of
Figure 9 is similar to the arrangement of Figure 7 and only the differences need be described. These are the inclusion
of a decompression unit (DU) allocator 82, which is connected between the outputs of the four caches 60 and the
inputs of the four decompression units 62, and a corresponding deallocator 84, which replaces the upper and lower
decode units 68, 70 between the outputs of the DUs 62 and the filters 66, and the inputs to
the trilinear interpolator 64. [...] The primary DU can provide four parallel upper-level texels, the
secondary DU can provide two parallel upper-level texels, and the tertiary and quaternary DUs can each provide
only one upper-level texel. The DU allocation unit 82 determines which of the DUs 62 are required, based on the
[...] output of each DU and the
multiplexing of the texture output from each DU. The deallocation unit 84 now takes over from the upper and lower
decode units.

[0047] The arithmetic used to filter the four lower-level mip-map texels is also simplified
[...] being deallocated by the deallocator 84, but this may not be necessary depending
[...] an implementational detail which will be [...]

[0048] Another area of optimisation is in the filters 40 (Fig. 5) that generate the lower-level mip-map. If the
[...] components can be arranged in such a way as to enable ease of lower mip-map generation
[...] generated by evenly sampling the uncompressed texture over its color range, the
[...] input to the mip-map filter to generate the lower-level mip-map. The
[...] The bit width of filtering
arithmetic on the indexes (compressed texture) is much smaller than had the filter been applied to the uncompressed
texture.

[0049] Such an arrangement is illustrated in Figure 10 of the drawings. This is based on Figure 7, though it could of
course also include the modification of Figure 9.
[...] decompression unit 62. Each decompression unit
[...] to provide indices which are filtered by
the filter 66 before being applied to a codebook lookup section 94 where the filtered indices are used to find the
corresponding codebook portion.

[0050] Figure 11 diagrammatically illustrates the relationship between the compressed code values and the colors
they represent. The diagram is assumed to be a part of the color triangle representing all possible colors on a
two-dimensional plane. Two colors C1 and C2 are denoted as principal colors. They define the ends of a line on which
other intermediate colors C11 to C1i lie. These other colors can therefore be formed by taking an appropriate weighted
average of the colors C1 and C2. The code values for the colors C11 to C1i are likewise defined by averaging the code
values of the colors C1 and C2.

[0051] The normal way of producing the intermediate colors is to start with the compressed code values of the colors
C1 and C2 and to determine the actual color values of the colors C1 and C2 to the accuracy of the system, which may
require 16 or even 24 bits per color. A weighted average is then performed on the 16- or 24-bit color values thus obtained.
However, we have appreciated that when the relationship between the colors and their code values is defined in the
way just described, it is possible to average the code values first and then subsequently to convert the averaged code
value thus obtained to the nearest available color. This means that the averaging operation no longer has to be done
with 16 or more bits of data, but can now be done with just the two or so bits which are required to define the different
compressed code values in use.
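
The equivalence can be checked numerically on a toy case. The sketch below assumes a single colour channel, two hypothetical principal colors `C1` and `C2`, and four 2-bit codes spaced evenly between them. It compares averaging the decoded colors and snapping to the nearest available color against averaging the codes directly. The tie-breaking rule (towards the lower code) is an assumption chosen so that both routes round the same way.

```python
from itertools import product

C1, C2 = 40.0, 220.0  # hypothetical principal colors, one channel
LEVELS = 4            # 2-bit codes 0..3, evenly spaced from C1 to C2


def decode(code):
    """Color represented by a code under the linear relationship."""
    return C1 + (C2 - C1) * code / (LEVELS - 1)


def nearest_code(color):
    """Code of the nearest available color; ties go to the lower code."""
    return min(range(LEVELS), key=lambda c: (abs(decode(c) - color), c))


def average_colors_then_quantise(codes):
    """Conventional route: decode, average at full precision, re-quantise."""
    mean_color = sum(decode(c) for c in codes) / len(codes)
    return nearest_code(mean_color)


def average_codes(codes):
    """Route described above: average the small codes directly
    (integer arithmetic, halves rounded down)."""
    n = len(codes)
    return (2 * sum(codes) + n - 1) // (2 * n)


def routes_agree():
    """Compare both routes over every combination of four input codes."""
    return all(average_codes(c) == average_colors_then_quantise(c)
               for c in product(range(LEVELS), repeat=4))


if __name__ == "__main__":
    print(routes_agree())
```

In this toy setting the sum of four 2-bit codes needs only four bits, whereas the conventional route sums full-width color values.
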

[0052] This lower mip-map optimisation technique can also be applied to DirectX compressed textures [ref. 14].
Here the two primary colors and the interpolated intermediate values pointed to by indexes are spaced evenly apart.
Figure 12 shows how 2-bit indexes are mapped to 16-bit texture colors. It is seen that the 2-bit indexes themselves
can be filtered instead of the full 16-bit value. This represents much less logic and therefore less overall cost.
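
A minimal sketch of filtering 2-bit indexes rather than 16-bit colors follows. It is not taken from Figure 12. It assumes RGB565 endpoints and the four-color mode of the BC1 (DXT1) layout as commonly documented, in which codes 0 and 1 select the two endpoints and codes 2 and 3 select the one-third and two-thirds blends, so the codes are first remapped to their position along the line between the endpoints. The rounding choices and all names are illustrative assumptions.

```python
POS_FROM_CODE = (0, 3, 1, 2)  # 2-bit code -> position 0..3 from c0 to c1
CODE_FROM_POS = (0, 2, 3, 1)  # inverse mapping


def unpack565(c):
    """Split a 16-bit RGB565 value into (r, g, b) fields."""
    return ((c >> 11) & 0x1F, (c >> 5) & 0x3F, c & 0x1F)


def pack565(r, g, b):
    """Combine (r, g, b) fields into a 16-bit RGB565 value."""
    return (r << 11) | (g << 5) | b


def decode(code, c0, c1):
    """Decode one 2-bit code to RGB565 by blending the two endpoints
    (truncating division; real decoders may round differently)."""
    p = POS_FROM_CODE[code]
    a, b = unpack565(c0), unpack565(c1)
    return pack565(*(((3 - p) * x + p * y) // 3 for x, y in zip(a, b)))


def filter_codes(codes):
    """2x2 box filter carried out on the 2-bit positions only."""
    total = sum(POS_FROM_CODE[c] for c in codes)  # 0..12, fits in 4 bits
    return CODE_FROM_POS[(total + 2) // 4]        # round to nearest position


if __name__ == "__main__":
    c0, c1 = 0xF800, 0x001F  # hypothetical endpoints: pure red, pure blue
    lower = filter_codes((0, 0, 1, 1))
    print(lower, hex(decode(lower, c0, c1)))
```

The filter itself touches only small integers; the single 16-bit decode happens afterwards, once per output texel.
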

[0053] In summary, the illustrated method of generating texture data for use in texturing an image is seen to comprise
first representing texture data by arbitrary compressed codes, in which selected compressed code values define
principal colors and other compressed code values define colors which can be formed by selected weighted averages of
principal colors, the corresponding code values also being weighted averages of the code values of the selected
principal colors. Then an output texel is interpolated from a plurality of input texels, with the interpolating step being effected
using compressed code values. The code values are subsequently decompressed to give the actual color values.

[0054] This method finds application for other purposes than just in the lower-level mip-map generator illustrated. It
can be used whenever it is desired to generate intermediate colors which are held in compressed form, provided that
the compressed codes bear a linear relationship which parallels that of the actual colors themselves. In particular, it
can be used in the original generation of the mip-maps held in the memory 22.

[0055] The embodiments of the invention which have been described and illustrated provide various improvements
over the known systems. In particular, efficient use is made of memory bandwidth for trilinear filtering, especially when
using texture compression caches. By generating the lower-level mip-map data from compressed upper-level mip-map
data as it is required, accesses across page breaks are reduced. A quad cache arrangement can be used to guarantee
one trilinear filtered pixel per clock period. The operation of the decompression logic is improved by using dynamic
decompression resource allocation. Finally, the unification of the lower-level mip-map generation with the
decompression logic provides a system with relatively low overhead.

[0056] It will be appreciated that many modifications may be made to the systems described and illustrated, which
represent only selected and presently preferred embodiments of the invention.


Claims 

1. Apparatus for generating texture data for use in texturing an image, comprising:

means for representing texture data by arbitrary compressed codes, in which selected compressed code values
define principal colors and other compressed code values define colors which can be formed by selected
weighted averages of principal colors, the corresponding code values also being weighted averages of the
code values of the selected principal colors; and

interpolating means for interpolating an output texel from a plurality of input texels;

characterised in that:

the interpolating means effects the interpolation using compressed code values.

2. A method of generating texture data for use in texturing an image, comprising the steps of:

representing texture data by arbitrary compressed codes, in which selected compressed code values define
principal colors and other compressed code values define colors which can be formed by selected weighted
averages of principal colors, the corresponding code values also being weighted averages of the code values
of the selected principal colors;

interpolating an output texel from a plurality of input texels;

characterised in that:

the interpolating step is effected using compressed code values.